Word Recognition by MLP-based Character Spotting and Dynamic Programming
Abstract
This paper describes a method for handprinted word recognition with the following characteristics: traditional pre-processing of single characters obtained by word segmentation is replaced by pre-processing based on piecewise normalization applied to whole words; feature extraction and character classification by a multilayer perceptron (MLP) are performed in a sliding-window fashion; the output string is matched against an ASCII word vocabulary by Dynamic Programming with the Levenshtein distance; and a list of word candidates is issued. Afterwards, when the language is formally known, an appropriate parser can be applied for full Sentence Recognition. Preliminary tests on a medium-size vocabulary show promising results.
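The vocabulary-matching step can be illustrated with a minimal sketch. It assumes unit costs for insertion, deletion, and substitution; the abstract does not state the costs the authors used. The function names, the sample vocabulary, and the noisy input string are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, computed by dynamic programming.

    Assumption: insertion, deletion, and substitution each cost 1.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_candidates(recognized: str, vocabulary: List[str],
                    top_n: int = 3) -> List[Tuple[str, int]]:
    """Return the top_n vocabulary words closest to the recognized string."""
    # Ties in distance are broken alphabetically by the tuple sort.
    scored = sorted((levenshtein(recognized, word), word) for word in vocabulary)
    return [(word, dist) for dist, word in scored[:top_n]]


# Hypothetical vocabulary and a noisy classifier output ("0" read instead of "O").
VOCABULARY = ["WORD", "WORLD", "WARD", "CORD", "SWORD"]
candidates = rank_candidates("W0RD", VOCABULARY)
print(candidates)
```

Returning a ranked list rather than a single best match mirrors the candidate list mentioned in the abstract, which a later parsing stage could use to resolve ties such as "WARD" versus "WORD".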
Similar resources
Keyword Spotting from Online Chinese Handwritten Documents using One-versus-All Character Classification Model
In this paper, we propose a method for text-query-based keyword spotting from online Chinese handwritten documents using a character classification model. The similarity between the query word and handwriting is obtained by combining the character classification scores. The classifier is trained with a one-versus-all strategy so that it gives high similarity to the target class and low scores to the oth...
Zone-based Keyword Spotting in Bangla and Devanagari Documents
In this paper we present a word spotting system in text lines for offline Indic scripts such as Bangla (Bengali) and Devanagari. Recently, it was shown that a zone-wise recognition method improves word recognition performance over a conventional full-word recognition system in Indic scripts [29]. Inspired by this idea, we consider the zone segmentation approach and use middle zone information ...
A survey of document image word spotting techniques
Vast collections of documents available in image format need to be indexed for information retrieval purposes. In this framework, word spotting is an alternative solution to optical character recognition (OCR), which is rather inefficient for recognizing text of degraded quality and unknown fonts usually appearing in printed text, or writing style variations in handwritten documents. Over the p...
Dynamic Character Model Generation for Document Keyword Spotting
This paper proposes a novel method of generating statistical Korean Hangul character models in real time. From a set of grapheme average images we compose any character images, and then convert them to P2DHMMs. The nonlinear, 2D composition of letter models in Hangul is not straightforward and has not been tried for machine-print character recognition. It is obvious that the proposed method of ...
An investigation of the use of dynamic time warping for word spotting and connected speech recognition
Several variations on algorithms for dynamic time warping have been proposed for speech processing applications. In this paper two general algorithms that have been proposed for word spotting and connected word recognition are studied. These algorithms are called the fixed range method and the local minimum method. The characteristics and properties of these algorithms are discussed. It is show...
Publication date: 1997